Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(icos): First phase of the firewall setup feature (DRE-258) #1451

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

DFINITYManu
Copy link
Contributor

@DFINITYManu DFINITYManu commented Sep 11, 2024

Per https://dfinity.atlassian.net/browse/DRE-258 .

We have successfully loaded rules of well-formed /boot/config/firewall.json in HostOS and GuestOS.

Details:
We would like to allow the node operator / provider to modify the firewall rules in order to allow incoming traffic to their nodes. This can be used to get the ability to fetch logs and metrics from the nodes in the same DC.

To achieve this, the node provider should add a new config file named firewall.json in the SetupOS config. This file is then propagated to the other parts of the IC-OS stack.

Note that the firewall rules can only be configured during node (re)deployment, in this phase. In phase 2, it will become possible to reconfigure already running nodes. Phase 2 will be more effort than phase 1.

@DFINITYManu DFINITYManu changed the title DRE-252: First phase of the firewall setup feature feat(icos) DRE-252: First phase of the firewall setup feature Sep 11, 2024
@DFINITYManu DFINITYManu changed the title feat(icos) DRE-252: First phase of the firewall setup feature feat(icos): First phase of the firewall setup feature (DRE-258) Sep 11, 2024
@github-actions github-actions bot added the feat label Sep 11, 2024
@DFINITYManu DFINITYManu force-pushed the firewallingsetupos branch 8 times, most recently from 31603d7 to ca53f81 Compare September 12, 2024 15:46
rs/ic_os/config/src/types.rs Outdated Show resolved Hide resolved
ic-os/setupos/data/deployment.json.template Outdated Show resolved Hide resolved
ic-os/components/setupos-scripts/devices.sh Outdated Show resolved Hide resolved
ic-os/components/setupos-scripts/devices.sh Outdated Show resolved Hide resolved
rs/ic_os/config/src/types/firewall.rs Show resolved Hide resolved
rs/ic_os/config/src/types/firewall.rs Show resolved Hide resolved
rs/ic_os/config/src/firewall_json.rs Show resolved Hide resolved
rs/ic_os/config/src/lib.rs Outdated Show resolved Hide resolved
rs/ic_os/utils/Cargo.toml Outdated Show resolved Hide resolved
@DFINITYManu DFINITYManu force-pushed the firewallingsetupos branch 2 times, most recently from 2c44825 to c849b6e Compare September 13, 2024 17:22
@DFINITYManu DFINITYManu force-pushed the firewallingsetupos branch 5 times, most recently from 773291f to f9c86eb Compare September 13, 2024 18:32
@DFINITYManu
Copy link
Contributor Author

DFINITYManu commented Sep 13, 2024

We don't yet do anything with the firewall.json file, but all this work, already, is enormous just to pass a silly config file down the pipe. Horrible. I am so looking forward to your work @andrewbattat enabling the use of the configuration structures making this job enormously easier and deleting so much code from our codebase.

@DFINITYManu DFINITYManu force-pushed the firewallingsetupos branch 2 times, most recently from 9121ce5 to 0ec762c Compare September 13, 2024 19:00
Copy link
Contributor

Vulnerable dependency information
The dependency-check job for the MR has new findings. Please update or remove these dependencies or obtain a commit exception from product security.

The findings are:
[Finding(repository='ic', scanner='BAZEL_RUST', vulnerable_dependency=Dependency(id='https://crates.io/crates/mio', name='mio', version='0.8.10', fix_version_for_vulnerability={'https://rustsec.org/advisories/RUSTSEC-2024-0019': ['>=0.8.11', '<0.7.2']}), vulnerabilities=[Vulnerability(id='https://rustsec.org/advisories/RUSTSEC-2024-0019', name='RUSTSEC-2024-0019', description='Tokens for named pipes may be delivered after deregistration', score=-1, risk_note=' ')], first_level_dependencies=[], projects=[], risk_assessor=[], risk=None, owning_teams=[], patch_responsible=[], due_date=None, score=-1, more_info=None), Finding(repository='ic', scanner='BAZEL_RUST', vulnerable_dependency=Dependency(id='https://crates.io/crates/rsa', name='rsa', version='0.9.2', fix_version_for_vulnerability={}), vulnerabilities=[Vulnerability(id='https://rustsec.org/advisories/RUSTSEC-2023-0071', name='RUSTSEC-2023-0071', description='Marvin Attack: potential key recovery through timing sidechannels', score=5, risk_note=' ')], first_level_dependencies=[], projects=[], risk_assessor=[], risk=None, owning_teams=[], patch_responsible=[], due_date=None, score=5, more_info=None)]

@venkkatesh-sekar
Copy link
Member

venkkatesh-sekar commented Sep 26, 2024

Vulnerable dependency information

This was due to a mixup in testing. Should be resolved now.

@DFINITYManu
Copy link
Contributor Author

@andrewbattat for some reason Github refuses to show me the thread where you and I were chatting about the empty firewall.json file. I still need your stamp in order to merge though. What do we do?

@andrewbattat
Copy link
Member

@andrewbattat for some reason Github refuses to show me the thread where you and I were chatting about the empty firewall.json file. I still need your stamp in order to merge though. What do we do?

Here it is: #1451 (comment)

I unresolved a few threads, and I'll do another review later today.

@@ -125,6 +138,21 @@ pub fn main() -> Result<()> {
println!("{}", to_cidr(ipv6_address, network_info.ipv6_subnet));
Ok(())
}
Some(Commands::CheckFirewallConfig { firewall_file }) => {
let _config_ = firewall_json::get_firewall_rules_json_or_default(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: omit the unused _config variable

Suggested change
let _config_ = firewall_json::get_firewall_rules_json_or_default(
firewall_json::get_firewall_rules_json_or_default(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may actually get the compiler to omit the call altogether, since "it has no effects".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LMK if we have to fix anyway.

Comment on lines +111 to +119
if [ -f "${DEFAULT_FIREWALL_FILE}" ]; then
echo "* Checking firewall rules in ${DEFAULT_FIREWALL_FILE}..."
/opt/ic/bin/setupos_tool check-firewall-config "${DEFAULT_FIREWALL_FILE}" >/dev/null || {
ret=$?
echo >&2 "Failed to parse default firewall rules file ${DEFAULT_FIREWALL_FILE}."
return ${ret}
}
fi
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use log_and_halt_installation_on_error to handle errors (see the rest of this file)

Also, can we not redirect the output from the check-firewall-config to /dev/null/?

Suggested change
if [ -f "${DEFAULT_FIREWALL_FILE}" ]; then
echo "* Checking firewall rules in ${DEFAULT_FIREWALL_FILE}..."
/opt/ic/bin/setupos_tool check-firewall-config "${DEFAULT_FIREWALL_FILE}" >/dev/null || {
ret=$?
echo >&2 "Failed to parse default firewall rules file ${DEFAULT_FIREWALL_FILE}."
return ${ret}
}
fi
fi
if [ -f "${DEFAULT_FIREWALL_FILE}" ]; then
echo "* Checking firewall rules in ${DEFAULT_FIREWALL_FILE}..."
/opt/ic/bin/setupos_tool check-firewall-config "${DEFAULT_FIREWALL_FILE}"
log_and_halt_installation_on_error "${?}" "Failed to parse default firewall rules file ${DEFAULT_FIREWALL_FILE}."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went and added that wherever I remembered that was necessary. If I missed one, I would appreciate an honest look on that.

@@ -26,6 +26,20 @@ function copy_config_files() {
log_and_halt_installation_on_error "1" "Configuration file 'config.ini' does not exist."
fi

if [ -v FIREWALL_FILE ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of the FIREWALL_FILE ev? Why don't we just copy over /var/ic/config/firewall.json if it exists?

Comment on lines +46 to +63
fn get_firewall_rules_json(firewall_file: &Path) -> Result<Vec<FirewallRule>, FirewallRulesError> {
let file = match File::open(firewall_file) {
Ok(file) => file,
Err(e) => {
return Err(FirewallRulesError::IOError((
firewall_file.to_path_buf(),
e,
)))
}
};
match serde_json::from_reader(&file) {
Ok(val) => Ok(val),
Err(e) => Err(FirewallRulesError::ParseError((
firewall_file.to_path_buf(),
e,
))),
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this code be simplified using ? and map_error

fn get_firewall_rules_json(firewall_file: &Path) -> Result<Vec<FirewallRule>, FirewallRulesError> {
    let file = File::open(firewall_file).map_err(|e| {
        FirewallRulesError::IOError((firewall_file.to_path_buf(), e))
    })?;

    serde_json::from_reader(&file).map_err(|e| {
        FirewallRulesError::ParseError((firewall_file.to_path_buf(), e))
    })
}

Comment on lines +109 to +116
macro_rules! bad_rules_must_be_bad {
($text:literal) => {
let temp_file = temp_fixture($text)?;
let outp = get_firewall_rules_json(temp_file.path());
assert!(outp.is_err());
Ok(())
};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
macro_rules! bad_rules_must_be_bad {
($text:literal) => {
let temp_file = temp_fixture($text)?;
let outp = get_firewall_rules_json(temp_file.path());
assert!(outp.is_err());
Ok(())
};
}
fn check_invalid_firewall_rule(text: &str) -> Result<()> {
let temp_file = temp_fixture(text)?;
let outp = get_firewall_rules_json(temp_file.path());
assert!(outp.is_err());
Ok(())
}

I find a helper function much more readable here

Comment on lines +7 to +8
use std::path::Path;
use std::path::PathBuf;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
use std::path::Path;
use std::path::PathBuf;
use std::path::{Path,PathBuf};

The setup routine uses /dev/vda to set up HostOS and GuestOS.
Unfortunately, that drive contains the setup partition, which
is happily destroyed by the installer as the installer proceeds.
When time comes to reboot, a bunch of gobbledygook is printed
to the console, and the system just hangs uselessly after that.

To work around that issue and permit installation of the OS in the
QEMU environment, we create an empty disk (sparse) and use that as
the first disk during QEMU invocation.  This solves the issue of
SetupOS destroying itself prior to reboot.

We also increase startup performance by avoiding the copy of the
tarball that contains the SetupOS image, and by asking tar to be
sparse with data copies whenever it can be.

Finally, we attempt to clean up the folder upon completion of the
execution.  This avoids large files remaining around after setup.

It's truly space-efficient eh!

```
manuel@devenv-manuel:~$ du -h /tmp/tmp.FO1kRm0gqF.qemu-launch-remote-vm/*
2.5G    /tmp/tmp.FO1kRm0gqF.qemu-launch-remote-vm/disk.img
1.6G    /tmp/tmp.FO1kRm0gqF.qemu-launch-remote-vm/target.img
```
In *virtualized, non-accelerated QEMU*, the write to disk goes like this:

```
root@SetupOS:~#     /opt/ic/bin/install-hostos.sh
install-hostos.sh - Start

* Installing HostOS disk-image...
* HostOS will be deployed to /dev/vda
* Temporarily extracting the HostOS image to RAM; please stand by for a few seconds
* Writing the HostOS image to /dev/vda
107088969728 bytes (107 GB, 100 GiB) copied, 188 s, 570 MB/s
25650+1 records in
25650+1 records out
107585994752 bytes (108 GB, 100 GiB) copied, 190.092 s, 566 MB/s
* Configuring EFI...
* Resizing partition...

install-hostos.sh - End (00:04:28)
```
# dd will detect nulls in chunks of 4M and sparsify the writes.
# In *non-KVM-accelerated* VM, this goes 500 MB/s, three times as fast as before.
echo "* Writing the GuestOS image to ${LV}"
dd if="${TMPDIR}/disk.img" of=${LV} bs=10M conv=sparse status=progress
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewbattat Do you remember why you chose pv over dd? I can't get into GitLab to see the discussion on the original MR.

Copy link
Member

@andrewbattat andrewbattat Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2f0db1c
It was for the sake of the installation progress bar (my first real node team task 🥲)

But now that the installation is so much faster, maybe the progress bar isn't necessary

Either way, this change should be in a separate PR in my opinion.

Comment on lines +567 to +568
truncate -s 128G target.img
qemu-system-x86_64 -machine type=q35,accel=kvm -enable-kvm -nographic -m 4G -bios /usr/share/ovmf/OVMF.fd -device vhost-vsock-pci,guest-cid=\\$$CID -boot d -drive file=target.img,format=raw,if=virtio -drive file=disk.img,format=raw,if=virtio -netdev user,id=user.0,hostfwd=tcp::2222-:22 -device virtio-net,netdev=user.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This same tooling is used for testing HostOS and GuestOS as well. If you'd like to add this for testing SetupOS, could you add a branch to the logic? We have a similar thing above for the remote VMs, in setting the nested flag.

These changes aren't really related to the firewall feature, could you please move them to a separate PR?

cd \\$$TEMP
tar xf disk-img.tar
qemu-system-x86_64 -machine type=q35,accel=kvm -enable-kvm -nographic -m 4G -bios /usr/share/ovmf/OVMF.fd -device vhost-vsock-pci,guest-cid=\\$$CID -drive file=disk.img,format=raw,if=virtio -netdev user,id=user.0,hostfwd=tcp::2222-:22 -device virtio-net,netdev=user.0
tar xSf $$IMAGE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think GNU tar that we use in devenvs/build container supports sparse extraction.

This option is meaningful only when creating or updating archives. It has no effect on extraction.

https://www.gnu.org/software/tar/manual/html_node/sparse.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants